Permuting Web and Social Graphs

Authors

  • Paolo Boldi
  • Massimo Santini
  • Sebastiano Vigna
Abstract

Since the first investigations on web graph compression, it has been clear that the ordering of the nodes of the graph has a fundamental influence on the compression rate (usually expressed as the number of bits per link). The authors of the LINK database [2], for instance, investigated three different approaches: an extrinsic ordering (URL ordering) and two intrinsic orderings based on the rows of the adjacency matrix (lexicographic and Gray code); they concluded that URL ordering has many advantages in spite of a small penalty in compression. In this paper we approach this issue in a more systematic way, testing some known orderings and proposing some new ones. Our experiments are carried out in the WebGraph framework [3], and show that the compression technique and the structure of the graph can produce significantly different results. In particular, we show that for the transposed web graph URL ordering is significantly less effective, and that some new mixed orderings combining host information and Gray/lexicographic orderings outperform all previous methods: in some large transposed graphs they yield the quite incredible compression rate of 1 bit per link. We also test these simple ideas on some non-web social networks and obtain results that are extremely promising and very close to those recently achieved using shingle orderings and backlinks compression schemes [4].
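The two intrinsic orderings mentioned in the abstract can be illustrated with a small sketch. It is not the authors' implementation and does not use the WebGraph framework. It assumes each node is represented by its adjacency-matrix row as a tuple of 0/1 values, and it assumes one possible convention for each ordering: ascending tuple comparison for the lexicographic order, and the rank in the binary reflected Gray code for the Gray order. The function names are hypothetical.

```python
from itertools import accumulate
from operator import xor


def adjacency_rows(successors, n):
    """Build 0/1 adjacency-matrix rows from per-node successor lists."""
    return [
        tuple(1 if j in succ else 0 for j in range(n))
        for succ in map(set, successors)
    ]


def lexicographic_order(rows):
    """Node permutation that sorts rows lexicographically (ascending, by assumption)."""
    return sorted(range(len(rows)), key=lambda i: rows[i])


def gray_key(row):
    """Rank of a row in the binary reflected Gray code.

    The running XOR of the bits (prefix parity) inverts the Gray code,
    so comparing these keys compares positions in the Gray sequence.
    """
    return tuple(accumulate(row, xor))


def gray_order(rows):
    """Node permutation that sorts rows in Gray-code order."""
    return sorted(range(len(rows)), key=lambda i: gray_key(rows[i]))


# Toy example with four nodes and two columns.
rows = [(1, 0), (1, 1), (0, 1), (0, 0)]
lex_perm = lexicographic_order(rows)
gray_perm = gray_order(rows)
```

In Gray order, consecutive rows of this toy example differ in exactly one position, which is the intuition behind using it to place similar successor lists next to each other. The mixed orderings from the abstract, which also use host information, are not sketched here.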

Similar resources

Finding Community Base on Web Graph Clustering

Search Pointers organize the main part of the application on the Internet. However, because of information management hardware, the high volume of data, and word similarities across different fields, most answers to users' questions aren't correct. Web graph clustering, and the placement of clusters in the corresponding answers, therefore helps users reach their intended results. Community (web communit...

Permuting Web Graphs

Since the first investigations on web graph compression, it has been clear that the ordering of the nodes of the graph has a fundamental influence on the compression rate (usually expressed as the number of bits per link). The author of the LINK database [1], for instance, investigated three different approaches: an extrinsic ordering (URL ordering) and two intrinsic (or coordinate-free) orderi...

Cayley Color Graphs of Inverse Semigroups and Groupoids

The notion of Cayley color graphs of groups is generalized to inverse semigroups and groupoids. The set of partial automorphisms of the Cayley color graph of an inverse semigroup or a groupoid is isomorphic to the original inverse semigroup or groupoid. The groupoid of color permuting partial automorphisms of the Cayley color graph of a transitive groupoid is isomorphic to the original groupoid.

Applying Semantic Social Graphs to Disambiguate Identity References

Person disambiguation monitors web appearances of a person by disambiguating information belonging to different people sharing the same name. In this paper we extend person disambiguation to incorporate the abstract notion of identity. This extension utilises semantic web technologies to represent the identity of the person to be found and the web resources to be disambiguated as semantic graph...

Interlinking Distributed Social Graphs

The rise in use of the social web has forced web users to duplicate their identity in fragmented information spaces. Commonly these spaces contain rich identity representations hidden within walled garden data silos. This paper presents work to export social graphs from such data silos as RDF datasets, and provide linkage between these social graphs according to a graph matching paradigm. Our w...

Journal:
  • Internet Mathematics

Volume 6

Publication date: 2009